Subsegmental, Segmental and Suprasegmental Features for Speaker Recognition Using Gaussian Mixture Model

Abstract

In the feature extraction stage, features representing speaker information are extracted from the speech signal. In the present study, the LP residual derived from the speech data is used for training and testing, and it is processed in the time domain at the subsegmental, segmental and suprasegmental levels. In the training phase, GMMs are built, one for each speaker, using that speaker's training data. During the testing phase, the models are evaluated on the test data. Based on the results with the test data, a decision is made about the identity of the speaker.
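The train-then-decide procedure described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the diagonal-covariance EM routine, the function names (`train_gmm`, `identify`), the number of mixture components, and the synthetic two-dimensional "features" are all assumptions made for illustration. In the study, the feature vectors would come from the processed LP residual.

```python
import math
import random

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def logsumexp(vals):
    top = max(vals)
    return top + math.log(sum(math.exp(v - top) for v in vals))

def train_gmm(data, k=2, iters=15, seed=0, floor=1e-3):
    """Fit a k-component diagonal GMM with plain EM (illustrative only)."""
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    means = [list(x) for x in rng.sample(data, k)]  # init means from data points
    gmean = [sum(x[d] for x in data) / n for d in range(dim)]
    gvar = [max(sum((x[d] - gmean[d]) ** 2 for x in data) / n, floor)
            for d in range(dim)]
    variances = [list(gvar) for _ in range(k)]      # init with global variance
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each vector
        resp = []
        for x in data:
            logs = [math.log(weights[j]) + log_gauss(x, means[j], variances[j])
                    for j in range(k)]
            norm = logsumexp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weights, means and (floored) variances
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-10
            weights[j] = nj / n
            means[j] = [sum(r[j] * x[d] for r, x in zip(resp, data)) / nj
                        for d in range(dim)]
            variances[j] = [max(sum(r[j] * (x[d] - means[j][d]) ** 2
                                    for r, x in zip(resp, data)) / nj, floor)
                            for d in range(dim)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return weights, means, variances

def avg_log_likelihood(model, data):
    """Average per-vector log-likelihood of data under one speaker's GMM."""
    weights, means, variances = model
    return sum(logsumexp([math.log(w) + log_gauss(x, m, v)
                          for w, m, v in zip(weights, means, variances)])
               for x in data) / len(data)

def identify(models, test_data):
    """Pick the speaker whose model scores the test data highest."""
    return max(models, key=lambda spk: avg_log_likelihood(models[spk], test_data))

def make_toy(center, n, seed):
    """Synthetic stand-in for real feature vectors (assumption for the demo)."""
    rng = random.Random(seed)
    return [[rng.gauss(center, 1.0), rng.gauss(center, 1.0)] for _ in range(n)]

# One GMM per speaker, trained on that speaker's data only.
models = {"speaker_a": train_gmm(make_toy(0.0, 100, seed=1)),
          "speaker_b": train_gmm(make_toy(5.0, 100, seed=2))}
test_a = make_toy(0.0, 50, seed=3)
test_b = make_toy(5.0, 50, seed=4)
print(identify(models, test_a), identify(models, test_b))
```

The decision rule here is a maximum average log-likelihood over the enrolled speakers, which is a common choice for closed-set identification; the abstract does not state which rule the authors used.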

Similar resources

Self Determining Speaker Recognition by Three Level Segmental Processing Of Linear Prediction Residual

This paper proposes the use of speaker-specific source information at different levels. The speaker recognition system exploits the source information (LP residual) present at different levels, namely subsegmental, segmental and suprasegmental. The subsegmental analysis considers the LP residual in blocks of 5 msec with a shift of 2.5 msec to extract speaker information. The segmental analysis extracts speaker informa...
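The subsegmental blocking mentioned above (5 msec blocks with a 2.5 msec shift) can be sketched as follows. The function name `frame_signal` and the 8 kHz sampling rate are assumptions for illustration; the listing does not state the sampling rate used.

```python
def frame_signal(signal, sample_rate, frame_ms, shift_ms):
    """Split a sequence of samples into overlapping fixed-length frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    shift = int(round(sample_rate * shift_ms / 1000.0))
    if frame_len <= 0 or shift <= 0:
        raise ValueError("frame and shift must be at least one sample long")
    return [signal[start:start + frame_len]
            for start in range(0, len(signal) - frame_len + 1, shift)]

# Assumed 8 kHz sampling: 5 ms -> 40 samples per frame, 2.5 ms -> 20-sample shift.
residual = list(range(100))  # placeholder samples standing in for an LP residual
subsegmental_frames = frame_signal(residual, 8000, frame_ms=5, shift_ms=2.5)
print(len(subsegmental_frames), len(subsegmental_frames[0]))
```

The same helper would cover other analysis levels by changing `frame_ms` and `shift_ms`.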

Combining Gaussian Mixture Models and Segmental Feature Models for Speaker Recognition

In most speaker recognition systems, speech utterances are not constrained in content or language. In a text-dependent speaker recognition system, the lexical content of the speech and the language are known in advance. The goal of this paper is to show that this information can be used by a segmental features (SF) approach to improve a standard Gaussian mixture model with MFCC features (GMM-MFCC). Speech fe...

Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model

Speech is one of the richest and most immediate ways for human beings to express emotion, and it conveys cognitive and semantic concepts among humans. In this study, a statistical method for emotion recognition from speech signals is proposed, and a learning approach based on the statistical model is introduced to classify internal feelings of the utterance....

Speaker Information using Subsegmental and Segmental Analysis of LP Residual

The Linear Prediction (LP) residual mostly contains the excitation source information. This work analyzes the LP residual twice: once using a frame size of 5 ms (subsegmental) and once using a frame size of 20 ms (segmental), each with a shift of 2.5 ms. The residual frames are then subjected to nonparametric Vector Quantization (VQ) to store the unique excitation sequences for each speaker. The testi...
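For readers unfamiliar with the LP residual, the sketch below shows one standard way to obtain it: estimate the predictor with the autocorrelation method and the Levinson-Durbin recursion, then inverse-filter the frame. The function names, the default LP order of 10, and the toy decaying-exponential input are assumptions for illustration, not details taken from the paper.

```python
def autocorr(frame, order):
    """Autocorrelation values r[0..order] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(order + 1)]

def levinson_durbin(r, order):
    """Solve for the inverse-filter coefficients a[0..order], with a[0] = 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # degenerate frame (e.g. silence): stop early
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a

def lp_residual(frame, order=10):
    """Inverse-filter the frame: e[n] = s[n] + sum_k a[k] * s[n - k]."""
    a = levinson_durbin(autocorr(frame, order), order)
    # Samples before the start of the frame are treated as zero.
    return [sum(a[k] * frame[n - k] for k in range(order + 1) if n - k >= 0)
            for n in range(len(frame))]

# Toy input: a decaying exponential, which a first-order predictor models well.
signal = [0.9 ** n for n in range(200)]
coeffs = levinson_durbin(autocorr(signal, 1), 1)
residual = lp_residual(signal, order=1)
print(coeffs[1], sum(e * e for e in residual) < sum(s * s for s in signal))
```

Because the predictor removes most of the short-term correlation, what remains in the residual is dominated by the excitation, which is the property the abstract relies on.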

Publication date: 2014